An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation
Authors
Abstract
This paper proposes an effective feature compensation scheme for a real-life situation in which no clean speech database is available to train the Gaussian Mixture Model (GMM) required by a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)-based model selection method to generate the GMM for the feature compensation method directly from the Hidden Markov Model (HMM) of the speech recognizer. We also present a strategy for combining the scheme with Cepstral Mean Normalization (CMN), where the HMM of the speech recognizer is trained on a CMN-processed speech database. Experimental results demonstrate that the proposed method provides speech recognition performance comparable to the matched data condition, in which the clean speech database used for HMM training of the speech recognizer is also available for GMM training. These results show that the SVM-based model selection method can effectively generate Gaussian components from the pre-trained HMM parameters, so that the GMM for the feature compensation method is tightly matched to the speech recognizer, resulting in robust speech recognition performance under various types of background noise.
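For readers unfamiliar with CMN, the following is a minimal sketch of standard per-utterance cepstral mean normalization, in which the mean of each cepstral coefficient over the utterance is subtracted from every frame. It is not the paper's implementation: the function name `cepstral_mean_normalize` and the toy two-frame utterance are illustrative assumptions.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean from every cepstral coefficient.

    `frames` is a list of feature vectors (one per frame), all of the
    same dimension. This is a hypothetical helper for illustration only.
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    means = [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]
    # Remove that mean from every frame.
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Toy utterance: two frames with two cepstral coefficients each.
utterance = [[1.0, 4.0], [3.0, 8.0]]
normalized = cepstral_mean_normalize(utterance)
```

After normalization, each coefficient has zero mean over the utterance. An HMM trained on features processed this way therefore models mean-removed cepstra, which is the case the abstract's CMN strategy addresses.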
Similar resources
Feature compensation in the cepstral domain employing model combination
In this paper, we present an effective cepstral feature compensation scheme which leverages knowledge of the speech model in order to achieve robust speech recognition. In the proposed scheme, the requirement for a prior noisy speech database in off-line training is eliminated by employing parallel model combination for the noise-corrupted speech model. Gaussian mixture models of clean speech a...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition
In our previous work, we proposed a feature compensation approach using high-order vector Taylor series approximation for noisy speech recognition. In this paper, first we improve the feature compensation in both efficiency and accuracy by boosted mixture learning of GMM, applying higher order information of VTS approximation only to the noisy speech mean parameters, acoustic context expansion,...
Exploring high-performance speech recognition in noisy environments using high-order taylor series expansion
In this paper, high-order Taylor series expansion is proposed to explore the most effective formulas for log-spectral compensation. The power feature, which is crucial to speech recognition in noisy environments and cannot be compensated by the usual feature compensation, is processed similarly to spectral subtraction. The modeling accuracy of speech log-spectral Gaussian Mixture Model (GMM) is also d...
Applications of Missing Feature Theory to Speaker Recognition
An important problem in speaker recognition is the degradation that occurs when speaker models trained with speech from one type of channel are used to score speech from another type of channel, known as channel mismatch. This thesis investigates various channel compensation techniques and approaches from missing feature theory for improving Gaussian mixture model (GMM)-based speaker verificati...